Singularities Affect Dynamics of Learning in Neuromanifolds

Authors

  • Shun-ichi Amari
  • Hyeyoung Park
  • Tomoko Ozeki
Abstract

The parameter spaces of hierarchical systems such as multilayer perceptrons include singularities due to the symmetry and degeneration of hidden units. A parameter space forms a geometrical manifold, called the neuromanifold in the case of neural networks. Such a model is identified with a statistical model, and a Riemannian metric is given by the Fisher information matrix. However, the matrix degenerates at singularities. Such a singular structure is ubiquitous not only in multilayer perceptrons but also in gaussian mixture probability densities, ARMA time-series models, and many other cases. The standard statistical paradigm of the Cramér-Rao theorem does not hold, and the singularity gives rise to strange behaviors in parameter estimation, hypothesis testing, Bayesian inference, model selection, and in particular, the dynamics of learning from examples. Prevailing theories so far have not paid much attention to the problems caused by singularity, relying only on ordinary statistical theories developed for regular (nonsingular) models. Only recently have researchers remarked on the effects of singularity, and theories are now being developed. This article gives an overview of the phenomena caused by the singularities of statistical manifolds related to multilayer perceptrons and gaussian mixtures. We demonstrate our recent results on these problems. Simple toy models are also used to show explicit solutions. We explain that the maximum likelihood estimator is no longer subject to the gaussian distribution even asymptotically, because the Fisher information matrix degenerates, that model selection criteria such as AIC, BIC, and MDL fail to hold in these models, that a smooth Bayesian prior becomes singular in such models, and that the trajectories of dynamics of learning are strongly affected by the singularity, causing plateaus or slow manifolds in the parameter space.
The natural gradient method is shown to perform well because it takes the singular geometrical structure into account. The generalization error and the training error are studied in some examples.
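The degeneracy of the Fisher metric described in the abstract can be checked numerically on a minimal toy model (a hypothetical illustration, not the paper's own code): a single-hidden-unit regressor f(x; v, w) = w·tanh(vx) with unit-variance gaussian output noise. At w = 0 the hidden weight v becomes unidentifiable, so the Fisher information matrix loses rank there; the damped inverse used in the natural-gradient step below is our own assumption, one common way to keep the update defined near the singular set.

```python
import numpy as np

# Toy model f(x; v, w) = w * tanh(v * x) with unit-variance gaussian noise.
# At w = 0 the hidden weight v is unidentifiable, so the Fisher metric
# degenerates (hypothetical illustration; not the paper's own example).

def fisher_matrix(v, w, xs):
    # For gaussian noise with unit variance, F = E[grad f . grad f^T].
    t = np.tanh(v * xs)
    df_dv = w * xs * (1.0 - t ** 2)  # d/dv [w tanh(vx)]
    df_dw = t                        # d/dw [w tanh(vx)]
    g = np.stack([df_dv, df_dw])     # shape (2, n)
    return g @ g.T / xs.size

rng = np.random.default_rng(0)
xs = rng.normal(size=10_000)

F_reg = fisher_matrix(v=1.0, w=1.0, xs=xs)   # regular point
F_sing = fisher_matrix(v=1.0, w=0.0, xs=xs)  # on the singular set w = 0

print(np.linalg.det(F_reg))   # positive: the metric is nondegenerate
print(np.linalg.det(F_sing))  # 0: the metric degenerates at the singularity

# A damped natural-gradient step F^{-1} grad L; the damping eps is our
# assumption, keeping the linear solve well-posed near the singular set.
def natural_gradient_step(theta, grad, F, lr=0.1, eps=1e-6):
    return theta - lr * np.linalg.solve(F + eps * np.eye(len(theta)), grad)
```

Along the line w = 0 the v-row and v-column of F vanish identically, which is why an undamped inverse (and hence the textbook natural gradient) is undefined exactly on the singular set.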


Related articles

Selection Criteria for Neuromanifolds of Stochastic Dynamics

We present ways of defining neuromanifolds – models of stochastic matrices – that are compatible with the maximization of an objective function such as the expected reward in reinforcement learning theory. Our approach is based on information geometry and aims to reduce the number of model parameters in the hope of improving gradient learning processes.


Normal forms of Hopf Singularities: Focus Values Along with some Applications in Physics

This paper aims to introduce the original ideas of normal form theory and the bifurcation analysis and control of small-amplitude limit cycles in non-technical terms, so that it is comprehensible to a wide range of Persian-speaking engineers and physicists. The history of normal forms goes back more than one hundred years, to the original ideas of Henri Poincaré. This tool p...


Empowering Nurses: Explaining the Role of Organizational Learning and the Dynamics of the Learning Environment

Introduction: Researchers acknowledge that different educational and contextual factors can influence human resource development. Because of the stressful hospital-ward environment, the lack of personnel support for students, and personnel's aggressive tempers, support for nursing students is an important factor. Therefore, the present study aims to investigate the role of the educational environmen...


Rings of Singularities

This paper is a slightly revised version of an introduction to singularity theory corresponding to a series of lectures given at the ``Advanced School and Conference on homological and geometrical methods in representation theory'' at the International Centre for Theoretical Physics (ICTP), Miramare - Trieste, Italy, 11-29 January 2010. We show how to associate to a triple of posit...


Dynamics of Learning Near Singularities in Layered Networks

We explicitly analyze the trajectories of learning near singularities in hierarchical networks, such as multilayer perceptrons and radial basis function networks, which include permutation symmetry of hidden nodes, and show their general properties. Such symmetry induces singularities in their parameter space, where the Fisher information matrix degenerates and odd learning behaviors, especiall...
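The rank loss induced by permutation symmetry can be seen in a small numerical check (a hypothetical sketch, not taken from the paper): for a two-hidden-unit perceptron f(x) = w1·tanh(v1·x) + w2·tanh(v2·x), overlapping units (v1 = v2) make the two weight gradients coincide, so the 4×4 Fisher matrix drops rank.

```python
import numpy as np

# f(x) = w1*tanh(v1*x) + w2*tanh(v2*x) with unit gaussian output noise.
# When v1 == v2 the two hidden units overlap: only w1 + w2 is identifiable,
# and the Fisher information matrix becomes rank-deficient.

def fisher_matrix(v1, w1, v2, w2, xs):
    t1, t2 = np.tanh(v1 * xs), np.tanh(v2 * xs)
    grads = np.stack([
        w1 * xs * (1 - t1 ** 2),  # df/dv1
        t1,                       # df/dw1
        w2 * xs * (1 - t2 ** 2),  # df/dv2
        t2,                       # df/dw2
    ])
    return grads @ grads.T / xs.size

rng = np.random.default_rng(1)
xs = rng.normal(size=10_000)

F_distinct = fisher_matrix(1.0, 0.5, -2.0, 0.5, xs)  # distinct units
F_overlap = fisher_matrix(1.0, 0.5, 1.0, 0.5, xs)    # overlapping units

print(np.linalg.matrix_rank(F_distinct))  # 4: full rank at a regular point
print(np.linalg.matrix_rank(F_overlap))   # 2: two directions cost nothing
```

The two zero-cost directions at the overlap are exactly the flat directions along which gradient trajectories stall, producing the plateau behavior discussed above.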



Journal:
  • Neural Computation

Volume 18, Issue 5

Pages: –

Published: 2006